Variable-Sized Map and Locality-Aware Reduce on Public-Resource Grids

Authors

  • Po-Cheng Chen
  • Yen-Liang Su
  • Jyh-Biau Chang
  • Ce-Kuen Shieh
Abstract

This paper presents a grid-enabled MapReduce framework called “Ussop”. Ussop provides its users with a set of C-language-based MapReduce APIs and an efficient runtime system for exploiting the computing resources available on public-resource grids. Given the volatile nature of the grid environment, Ussop introduces two novel task scheduling algorithms: Variable-Sized Map Scheduling (VSMS) and Locality-Aware Reduce Scheduling (LARS). VSMS dynamically adjusts the size of the map tasks according to the computing power of grid nodes, while LARS minimizes the cost of transferring intermediate data over a wide-area network. The experimental results indicate that both VSMS and LARS outperform conventional scheduling algorithms.
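The abstract does not give the details of either algorithm, so the following is only a minimal sketch of the two ideas it names, not Ussop's actual implementation (Ussop's own APIs are C-based; Python is used here for brevity). It assumes that VSMS can be approximated by splitting the input in proportion to each node's relative computing power, and that LARS can be approximated by a greedy rule that runs each reduce task on the node already holding the most intermediate data for that partition. All function names, parameters, and data layouts below are hypothetical.

```python
from typing import Dict


def variable_sized_splits(total_records: int,
                          node_power: Dict[str, float]) -> Dict[str, int]:
    """Assumed VSMS-style rule: map task size proportional to node power."""
    total_power = sum(node_power.values())
    nodes = sorted(node_power)
    sizes = {n: int(total_records * node_power[n] / total_power) for n in nodes}
    # Hand out the records lost to rounding, fastest nodes first.
    leftover = total_records - sum(sizes.values())
    for n in sorted(nodes, key=lambda n: -node_power[n])[:leftover]:
        sizes[n] += 1
    return sizes


def locality_aware_reduce(intermediate: Dict[str, Dict[int, int]]) -> Dict[int, str]:
    """Assumed LARS-style rule: intermediate[node][partition] is the number of
    bytes of that partition stored on that node; each partition is reduced on
    the node that already holds the largest share of it."""
    partitions = sorted({p for held in intermediate.values() for p in held})
    nodes = sorted(intermediate)
    return {p: max(nodes, key=lambda n: intermediate[n].get(p, 0))
            for p in partitions}


def transfer_cost(intermediate: Dict[str, Dict[int, int]],
                  placement: Dict[int, str]) -> int:
    """Bytes that must cross the network for a given reduce placement."""
    return sum(size
               for node, held in intermediate.items()
               for p, size in held.items()
               if placement[p] != node)


if __name__ == "__main__":
    print(variable_sized_splits(1000, {"slow": 1.0, "fast": 3.0}))
    data = {"a": {0: 100, 1: 10}, "b": {0: 20, 1: 80}}
    placement = locality_aware_reduce(data)
    print(placement, transfer_cost(data, placement))
```

This greedy placement ignores load balance and per-link bandwidth, both of which a real wide-area scheduler would likely need to weigh; it only illustrates why placing a reduce task near its data lowers the transfer cost.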

Similar resources

Predoop: Preempting Reduce Task for Job Execution Accelerations

Map/Reduce is a popular parallel processing framework for data intensive computing. For overlapping the Map task’s execution phase and the Reduce task’s intermediate data fetching and merging phase, existing Map/Reduce schedulers always pre-launch the Reduce task at the specific threshold where its map tasks have been launched, and this pattern incurs the occupation of the consuming resources o...

Scalable community-driven data sharing in e-science grids

E-science projects of various disciplines face a fundamental challenge: thousands of users want to obtain new scientific results by application-specific and dynamic correlation of data from globally distributed sources. Considering the involved enormous and exponentially growing data volumes, centralized data management reaches its limits. Since scientific data are often highly skewed and explor...

Effect of green human resource management practices on environmental sustainability

In today’s world, green human resource management is one of the most important factors for a forward-thinking, environment-friendly business. Most researchers hold that employees must be empowered and made environmentally aware while carrying out green human resource management practices. The present study examines the impact of different Green human resource prac...

Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments

The Hadoop MapReduce framework is an important distributed processing model for large-scale data-intensive applications. The current Hadoop and the existing Hadoop distributed file system’s rack-aware data placement strategy in MapReduce in the homogeneous Hadoop cluster assume that each node in a cluster has the same computing capacity and that the same workload is assigned to each node. Default Hadoop d...

Boosting MapReduce with Network-Aware Task Assignment

Running MapReduce in a shared cluster has become a recent trend to process large-scale data analytics applications while improving the cluster utilization. However, the network sharing among various applications can lead to constrained and heterogeneous network bandwidth available for MapReduce applications. This further increases the severity of network hotspots in racks, and makes existing ta...

Publication date: 2010